Corpus: isl_newscrawl_2011

3.7.3 Distribution of the string similarity for different rank ranges

Distribution of the Levenshtein distance for words in each rank range

String similarity for the top 1,000 words
Distance    Percentage of words (%)
0           29.1139
1           31.6456
2           39.2405

String similarity for the top 10,000 words
Distance    Percentage of words (%)
0           11.9120
1           32.3027
2           55.7853

String similarity for the top 100,000 words
Distance    Percentage of words (%)
0           6.4750
1           28.6802
2           64.8448

String similarity for the top 1,000,000 words
Distance    Percentage of words (%)
0           6.3322
1           28.5959
2           65.0719
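
The page does not say how the distance behind these tables is measured, so the following is only a minimal sketch of one plausible reading. It assumes that each word is compared with every other word in the same rank range after lowercasing, that its nearest-neighbour Levenshtein distance is recorded, and that distances above 2 are counted in the "2" bucket. The function names, the lowercasing step, and the cap are all illustrative assumptions, not the corpus project's documented procedure.

```python
from collections import Counter
from typing import Dict, List


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def nearest_distance(index: int, words: List[str], cap: int = 2) -> int:
    """Smallest distance from words[index] to any other entry, capped at `cap`.

    Assumption: comparison is case-insensitive, which is one way a
    distance of 0 between two distinct list entries could arise.
    """
    target = words[index].lower()
    best = cap
    for j, other in enumerate(words):
        if j == index:
            continue
        best = min(best, levenshtein(target, other.lower()))
        if best == 0:
            break  # cannot get any closer
    return best


def distance_distribution(words: List[str], cap: int = 2) -> Dict[int, float]:
    """Percentage of words whose nearest neighbour lies at distance 0..cap."""
    counts = Counter(nearest_distance(i, words, cap) for i in range(len(words)))
    return {d: 100.0 * counts[d] / len(words) for d in range(cap + 1)}


if __name__ == "__main__":
    # Toy word list, not taken from the corpus.
    sample = ["Hus", "hus", "hun", "madur", "og"]
    for distance, share in distance_distribution(sample).items():
        print(distance, f"{share:.4f}")
```

The pairwise loop is quadratic in the list size, so this form is only practical for small lists such as the top 1,000 words. Larger rank ranges would need an index (for example a trie or deletion-neighbourhood lookup) that is outside the scope of this sketch.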
284 msec needed at 2018-03-10 11:10